chart presents clear evidence that outcome data is the least collected type and
that the best sources are NAEP, National Longitudinal Study, and the
International Association data base.

NAEP, in fact, is the only Department of Education data base that focuses on
outcomes, and it is funded and managed by the National Institute of Education.

Walberg, who urges the creation of a National Bureau of Educational Standards,
argues for the establishment of absolute measures. He cites the psychometrist
John Carroll in noting that in 1925, L.L. Thurstone had attempted to calibrate
mental abilities and tasks to chronological age and learning time. If
Thurstone's work had been continued, we might be in a much better position today
to actually have both absolute measures and a wide range of longitudinal
information. Noting that the athletic world has the finest set of performance
measures, Walberg says of educational measurement, "It is as though each test
publisher and teacher had a different meter stick; and yet there is no way to
equate them."

We need test scores. No one recommends their elimination. But we need to keep
in mind the inherent shortcoming of test scores, particularly the lack of
comparability. One author suggests that test scores be accompanied by
descriptions of what was tested (B. Turnbull).

One proposed solution is to calibrate tests against a national standard test,
such as NAEP. Although that would be limited to three age levels, it would be
a step in the direction Walberg seeks.

Another author (Harrison) noted that in the early 1970s, the U.S. Office of
Education (OE) spent a great deal of time and money on the development of such
an equating instrument, the Anchor Test by Dr. Charles Hammer. Regrettably, it
was neither publicized nor used by OE. An appropriate area of inquiry at this
juncture might be to re-examine the Anchor Test to determine if its resurrection
is possible. Harrison warns against either a federal or State attempt to design
a testing program to make outcome comparisons, urging instead that States agree
on a set of achievement tests that each State could administer, or that an
equating device like the Anchor Test be created.

Smith also addresses the problem of the lack of correspondence between tests.
His comments focus on the High School and Beyond (HSB) survey. He notes
problems with the quality of HSB student achievement data and with the nature
of the concepts measured by HSB relative to the methodology used for the testing.
Smith also notes problems in articulation with International Evaluation of
Achievement, NAEP, and State assessments.

Lehnen presents a case study on how one State legislature (Indiana) used NCES
data to compare resources and performance between that State and other States.
In the area of output measures, Indiana used three sets of NCES-supplied
numbers:

o Median years of education,

o Percent graduating from high school, and

o Average SAT scores (for 22 states).

